feat: support multiple disk selection for one partition when using local storage #435
Conversation
This is a draft to support multiple disk selection for one partition. @jerqi @advancedxy @xianjingfeng If the design is OK, I will go ahead.

I'll take a look this weekend.
Force-pushed from 7f5aeb6 to 69b8a64.
No. Currently, the storage of a partition is only switched when a disk reaches the high watermark (sketched below).
Yes, this could be supported in the future.
No. If the disk space is small and there are many partitions, this feature can make full use of the local disks instead of falling back to HDFS.
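A minimal sketch of that high-watermark switching rule, using hypothetical names (`DiskItem`, `HighWatermarkSelector`) rather than the actual Uniffle classes:

```java
import java.util.List;

// Hypothetical disk abstraction; only the fields this sketch needs.
class DiskItem {
  final String basePath;
  final long capacityBytes;
  long usedBytes;

  DiskItem(String basePath, long capacityBytes) {
    this.basePath = basePath;
    this.capacityBytes = capacityBytes;
  }

  double usedRatio() {
    return (double) usedBytes / capacityBytes;
  }
}

class HighWatermarkSelector {
  private final double highWatermark; // e.g. 0.85 -> switch once a disk is 85% full

  HighWatermarkSelector(double highWatermark) {
    this.highWatermark = highWatermark;
  }

  // Keep writing the partition to its current disk until that disk crosses the
  // high watermark, then chain the partition onto another disk with headroom.
  DiskItem select(DiskItem current, List<DiskItem> candidates) {
    if (current.usedRatio() < highWatermark) {
      return current;
    }
    for (DiskItem candidate : candidates) {
      if (candidate != current && candidate.usedRatio() < highWatermark) {
        return candidate;
      }
    }
    return current; // no disk has headroom; the server may fall back to HDFS
  }
}
```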
If we only support switching to another disk when a disk reaches the high watermark, I think it is unnecessary to let the client decide whether to write to multiple disks. I just want to make large-partition writes faster, so I hope to support concurrent disk writing for a partition.
I think the feature you mentioned is also meaningful; I have had a similar thought. Concurrent writing could be supported later. In the current PR, I introduce the …
Is this PR ready for review?
No, I will make some changes.
int64 offset = 6;
int32 length = 7;
int64 timestamp = 8;
int32 storageId = 9;
I don't think this is a good design choice. It leaks too many details to the client.
And there are other storage types that don't need local storage, such as memory and memory_hdfs.
Couldn't we just reuse the metadata in `private final Map<PartitionUnionKey, ChainableLocalStorage> partitionsOfLocalStorage;`? Or, alternatively, look up all the disk dirs to find the correct storage path?
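A minimal sketch of the map-based lookup suggested here; the `PartitionUnionKey`/`ChainableLocalStorage` stubs below are simplified stand-ins for the PR's real classes, and the key shape is an assumption:

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-ins for the PR's PartitionUnionKey / ChainableLocalStorage.
record PartitionUnionKey(String appId, int shuffleId, int partitionId) {}

class ChainableLocalStorage {
  private final String basePath;

  ChainableLocalStorage(String basePath) {
    this.basePath = basePath;
  }

  String getBasePath() {
    return basePath;
  }
}

class LocalStoragePathResolver {
  private final Map<PartitionUnionKey, ChainableLocalStorage> partitionsOfLocalStorage =
      new ConcurrentHashMap<>();

  void register(PartitionUnionKey key, ChainableLocalStorage storage) {
    partitionsOfLocalStorage.put(key, storage);
  }

  // Resolve the storage on the server side from the request metadata,
  // so the client never has to carry a storageId around.
  String resolveBasePath(String appId, int shuffleId, int partitionId) {
    ChainableLocalStorage storage =
        partitionsOfLocalStorage.get(new PartitionUnionKey(appId, shuffleId, partitionId));
    Objects.requireNonNull(storage, "no local storage registered for this partition");
    return storage.getBasePath();
  }
}
```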
Let me think twice.
The original shuffle-data reading follows this rule:
- Read the whole remote index file, then split and filter it to get the required segments.
- Read the shuffle data segment by segment, according to each segment's offset and length.

If we expose a unified abstraction so the client can keep following this reading sequence, we have to compose the multiple files into one abstract file and re-calculate the offset and length of every request to map it to the real file, as sketched below.
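For illustration only, a minimal sketch of that re-calculation with hypothetical names: a composed view over several per-disk files that maps a logical (offset, length) back to a concrete file.

```java
import java.util.ArrayList;
import java.util.List;

// One physical per-disk data file plus its start offset in the composed view (hypothetical).
record FileSlice(String path, long logicalStart, long length) {}

// Where a read actually lands: a concrete file and the offset inside it.
record PhysicalRead(String path, long offsetInFile, long length) {}

class ComposedDataFile {
  private final List<FileSlice> slices = new ArrayList<>();
  private long totalLength = 0;

  void append(String path, long length) {
    slices.add(new FileSlice(path, totalLength, length));
    totalLength += length;
  }

  // Re-calculate a segment's (offset, length) in the abstract composed file
  // into the real file; assumes a segment never crosses a file boundary.
  PhysicalRead map(long logicalOffset, long length) {
    for (FileSlice slice : slices) {
      long end = slice.logicalStart() + slice.length();
      if (logicalOffset >= slice.logicalStart() && logicalOffset + length <= end) {
        return new PhysicalRead(slice.path(), logicalOffset - slice.logicalStart(), length);
      }
    }
    throw new IllegalArgumentException("segment out of range or crossing file boundaries");
  }
}
```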
> I don't think this is a good design choice. It's leaking too many details to the client.

Yes. I also prefer giving the client a unified abstraction that hides the details of the multiple underlying storages, but currently I have no good ideas for this.

> And there are other storage types that don't need local storage, such as memory and memory_hdfs.

Emmm... Only localfile storage will use this proto. Memory reading uses an independent API, and HDFS reading fetches data directly from the HDFS datanode instead of fetching remote data from the shuffle server.
int64 offset = 6;
int32 length = 7;
int64 timestamp = 8;
int32 storageId = 9;
We should avoid using storageId. It exposes too many details.
Do you have any idea?
Not yet.
Force-pushed from 69b8a64 to dcde67e.
I have updated the code, which is a little different from the previous commit in the LocalStorageManager part. I introduce the …
Force-pushed from dcde67e to 6955490.
Codecov Report

@@            Coverage Diff             @@
##           master     #435      +/-   ##
============================================
- Coverage     58.62%   58.55%    -0.08%
- Complexity     1642     1658       +16
============================================
  Files           199      203        +4
  Lines         11173    11274      +101
  Branches        989      997        +8
============================================
+ Hits           6550     6601       +51
- Misses         4231     4282       +51
+ Partials        392      391        -1
Is this ready for review? It seems we haven't gotten rid of storageId in the proto. I haven't thought it through thoroughly, but it should be possible to eliminate the need for storageId.
Yes. Ready for review.
Looking forward to a better design.
I went through this PR today. It may require some id to indicate which shuffle_data/shuffle_index a request currently points to, but I don't think it's a good design to have the client pass around the storageIndex/storageId. One possible solution would be similar to how HDFS handles multiple IndexFile/DataFile pairs.

In steps 2 and 3, you may pass a …
It could be, but mostly it would be the base path of the shuffleDataFile. As the …
A concrete data file name is necessary and cannot be generated by the client, because there are multiple files for one partition.
I see, makes sense.
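A minimal sketch of the file-name-based alternative discussed in this thread, with hypothetical names: the server attaches the concrete data file name to every segment it returns, so the client carries a path rather than an opaque storageId.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shape of one index entry as exposed to the client: the concrete
// data file name takes the place of the opaque storageId.
record SegmentLocation(String dataFileName, long offset, int length) {}

class MultiFileIndexView {
  private final List<SegmentLocation> segments = new ArrayList<>();

  // Called while the server scans each per-disk index file; every entry
  // remembers which disk's data file it belongs to.
  void addSegment(String dataFileName, long offset, int length) {
    segments.add(new SegmentLocation(dataFileName, offset, length));
  }

  // The client receives concrete file names it could not generate itself,
  // because one partition now spans multiple data files.
  List<SegmentLocation> getSegments() {
    return segments;
  }
}
```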
What changes were proposed in this pull request?
To support multiple disk selection for one partition when using local storage.

- Introduce a pluggable `StorageSelector` (sketched below) to support different selection strategies, such as the multiple-disk selection strategy (`ChainableLocalStorageSelector`) or a concurrent strategy.
- Introduce `LocalFileClientReadMultiFileHandler` to manage multiple handlers; each handler is bound to one disk's data file and index file.
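A rough sketch of what the pluggable selector could look like; the `select` signature and the chaining logic below are assumptions for illustration, not the PR's actual code.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical shape of the pluggable selector; the real interface may differ.
interface StorageSelector<S> {
  // Pick which local storage (disk) the next write for this partition goes to.
  S select(String appId, int shuffleId, int partitionId, List<S> candidates);
}

// Assumed sketch of the multiple-disk strategy: stay on the head of the chain
// and only move to the next disk when the current one can no longer accept writes.
class ChainableLocalStorageSelector<S> implements StorageSelector<S> {
  private final Predicate<S> canWrite; // e.g. "still below the high watermark"
  private int head = 0;

  ChainableLocalStorageSelector(Predicate<S> canWrite) {
    this.canWrite = canWrite;
  }

  @Override
  public S select(String appId, int shuffleId, int partitionId, List<S> candidates) {
    while (head < candidates.size() - 1 && !canWrite.test(candidates.get(head))) {
      head++; // chain the partition onto the next disk
    }
    return candidates.get(head);
  }
}
```

A concurrent-writing strategy, as discussed earlier in the conversation, could later sit behind the same interface.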
Why are the changes needed?

Does this PR introduce any user-facing change?
Yes.
How was this patch tested?